The Foundation of Hierarchy
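The memory hierarchy rests on the trade-off between Static RAM (SRAM) and Dynamic RAM (DRAM). SRAM stores each bit in a bistable memory cell built from six transistors. Imagine an inverted pendulum: it is stable when tipped fully left or fully right, but only metastable when balanced in the middle, where the slightest push topples it to one side. This bistability means the cell holds its value for as long as power is applied and is relatively insensitive to disturbances such as electrical noise. The cell is also fast to access, but six transistors per bit make SRAM expensive and less dense. DRAM, conversely, stores each bit as charge on a tiny capacitor (approx. 30 × 10⁻¹⁵ farads, i.e. 30 femtofarads). Because the charge leaks, DRAM is slower, more sensitive to disturbances, and must be refreshed periodically.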
DRAM Organization & Bus Transactions
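To minimize pin count, the bits in a DRAM chip are partitioned into $d$ supercells arranged in an $r \times c$ grid, where $rc = d$. Because the row and column addresses are sent one after the other over the same address pins, fewer pins are needed than for a linear array. Accessing a supercell is therefore a two-step process: the memory controller first sends a row address with RAS (Row Access Strobe), which copies the entire row into the chip's internal row buffer, and then sends a column address with CAS (Column Access Strobe), which selects the supercell from that buffer. This contributes to sumarraycols being slower than a row-wise traversal: its strided accesses tend to land in different DRAM rows, so it misses the row buffer repeatedly.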
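The effect of access order on the row buffer can be illustrated with a toy model. The sketch below is not how a real memory controller is implemented: the type `dram_t`, the functions `dram_init`, `dram_read`, `count_ras_row_major`, and `count_ras_col_major`, and the 4 × 4 grid size are all hypothetical names and values chosen for illustration. It also assumes the controller leaves a row in the row buffer until a different row is requested, and it ignores caches entirely.

```c
#include <stddef.h>

/* Hypothetical toy DRAM chip: d = ROWS * COLS supercells in an r x c grid. */
enum { ROWS = 4, COLS = 4 };

typedef struct {
    unsigned char cells[ROWS][COLS]; /* supercell contents */
    int open_row;                    /* row held in the row buffer, -1 if none */
    long ras_count;                  /* RAS requests issued so far */
    long cas_count;                  /* CAS requests issued so far */
} dram_t;

/* Fill supercell (i, j) with its linear index and empty the row buffer. */
void dram_init(dram_t *m)
{
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            m->cells[i][j] = (unsigned char)(i * COLS + j);
    m->open_row = -1;
    m->ras_count = 0;
    m->cas_count = 0;
}

/* Read supercell number `index` (0 <= index < ROWS * COLS). */
unsigned char dram_read(dram_t *m, int index)
{
    int row = index / COLS; /* address sent with RAS */
    int col = index % COLS; /* address sent with CAS */

    /* Assumption: RAS is only needed when the row buffer holds another row. */
    if (m->open_row != row) {
        m->open_row = row;
        m->ras_count++;
    }
    m->cas_count++;
    return m->cells[row][col];
}

/* Visit every supercell row by row; return the number of RAS requests. */
long count_ras_row_major(void)
{
    dram_t m;
    dram_init(&m);
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            (void)dram_read(&m, i * COLS + j);
    return m.ras_count;
}

/* Visit every supercell column by column; return the number of RAS requests. */
long count_ras_col_major(void)
{
    dram_t m;
    dram_init(&m);
    for (int j = 0; j < COLS; j++)
        for (int i = 0; i < ROWS; i++)
            (void)dram_read(&m, i * COLS + j);
    return m.ras_count;
}
```

Under these assumptions the row-wise sweep issues one RAS per row, while the column-wise sweep changes rows on every access and so issues one RAS per supercell. Real systems add caches, multiple banks, and address interleaving, so the actual penalty depends on the hardware.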
Data Movement
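Data moves between the CPU and main memory through bus transactions. The system bus connects the CPU to the I/O bridge, and the memory bus connects the I/O bridge to main memory. For a load such as movq A, %rax (a read transaction), the CPU places address A on the system bus, the I/O bridge passes it on to the memory bus, and the memory controller converts it into the row and column signals for the DRAM grid. The requested word then travels back over the memory bus and the system bus, and the CPU copies it into %rax.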
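The hops of a read transaction can be traced with a small software model. This is only a sketch: the types `cpu_t` and `main_memory_t`, the functions `memory_bus_read`, `io_bridge_read`, and `cpu_load`, and the 16-word memory size are hypothetical, and the model assumes word addressing and ignores timing, caches, and the RAS/CAS step inside the DRAM.

```c
#include <stdint.h>

enum { MEM_WORDS = 16 }; /* hypothetical, tiny main memory */

typedef struct {
    uint64_t words[MEM_WORDS]; /* one 8-byte word per address (assumption) */
} main_memory_t;

typedef struct {
    uint64_t rax; /* destination register of the load */
} cpu_t;

/* Main memory sees the address on the memory bus and returns the word. */
uint64_t memory_bus_read(const main_memory_t *mem, uint64_t addr)
{
    return mem->words[addr % MEM_WORDS]; /* wrap to stay inside the toy memory */
}

/* The I/O bridge forwards the address from the system bus to the memory bus
 * and passes the returned data back toward the CPU. */
uint64_t io_bridge_read(const main_memory_t *mem, uint64_t addr)
{
    return memory_bus_read(mem, addr);
}

/* Model of `movq A, %rax`: the CPU puts A on the system bus, waits for the
 * data to come back, and copies it into %rax. */
void cpu_load(cpu_t *cpu, const main_memory_t *mem, uint64_t addr)
{
    cpu->rax = io_bridge_read(mem, addr);
}
```

Each function stands for one hop of the transaction, so the call chain `cpu_load` → `io_bridge_read` → `memory_bus_read` mirrors the path of the address, and the return values mirror the path of the data.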